智能论文笔记

Snapshot HDR Video Construction Using Coded Mask

Masheal Alghamdi , Qiang Fu , Ali Thabet , Wolfgang Heidrich

分类：计算机视觉

2021-12-05

本文研究了从快照编码的LDR视频重建高动态范围（HDR）视频。构建HDR视频需要为每个帧恢复HDR值并保持连续帧之间的一致性。从单个图像捕获的HDR图像获取，也称为快照HDR成像，可以通过多种方式实现。例如，通过将光学元件引入相机的光学堆叠来实现可重新配置的快照HDR相机;通过将编码掩模放置在传感器前方的小支座距离处。可以使用深度学习方法从捕获的编码图像中恢复高质量的HDR图像。本研究利用3D-CNNS从编码LDR视频执行联合去脱模，去噪和HDR视频重建。我们通过引入考虑短期和长期一致性的时间损耗函数来执行更季度一致的HDR视频重建。获得的结果是有前途的，可以使用传统相机导致经济实惠的HDR视频捕获。

translated by 谷歌翻译

ISP-Agnostic Image Reconstruction for Under-Display Cameras

Miao Qi , Yuqi Li , Wolfgang Heidrich

分类：计算机视觉

2021-11-02

近年来已经提出了显示屏下的显示器，作为减少移动设备的形状因子的方式，同时最大化屏幕区域。不幸的是，将相机放在屏幕后面导致显着的图像扭曲，包括对比度，模糊，噪音，色移，散射伪像和降低光敏性的损失。在本文中，我们提出了一种图像恢复管道，其是ISP-Annostic，即它可以与任何传统ISP组合，以产生使用相同的ISP与常规相机外观匹配的最终图像。这是通过执行Raw-Raw Image Restoration的深度学习方法来实现的。为了获得具有足够对比度和场景多样性的大量实际展示摄像机培训数据，我们还开发利用HDR监视器的数据捕获方法，以及数据增强方法以产生合适的HDR内容。监视器数据补充有现实世界的数据，该数据具有较少的场景分集，但允许我们实现细节恢复而不受监视器分辨率的限制。在一起，这种方法成功地恢复了颜色和对比度以及图像细节。

translated by 谷歌翻译

Diffractive lensless imaging with optimized Voronoi-Fresnel phase

Qiang Fu , Dong-Ming Yan , Wolfgang Heidrich

分类：计算机视觉

2021-09-28

Lensless cameras are a class of imaging devices that shrink the physical dimensions to the very close vicinity of the image sensor by replacing conventional compound lenses with integrated flat optics and computational algorithms. Here we report a diffractive lensless camera with spatially-coded Voronoi-Fresnel phase to achieve superior image quality. We propose a design principle of maximizing the acquired information in optics to facilitate the computational reconstruction. By introducing an easy-to-optimize Fourier domain metric, Modulation Transfer Function volume (MTFv), which is related to the Strehl ratio, we devise an optimization framework to guide the optimization of the diffractive optical element. The resulting Voronoi-Fresnel phase features an irregular array of quasi-Centroidal Voronoi cells containing a base first-order Fresnel phase function. We demonstrate and verify the imaging performance for photography applications with a prototype Voronoi-Fresnel lensless camera on a 1.6-megapixel image sensor in various illumination conditions. Results show that the proposed design outperforms existing lensless cameras, and could benefit the development of compact imaging systems that work in extreme physical conditions.

translated by 谷歌翻译

Robustifying Markowitz

Wolfgang Karl Härdle , Yegor Klochkov , Alla Petukhina , Nikita Zhivotovskiy

分类：机器学习

2022-12-28

Markowitz mean-variance portfolios with sample mean and covariance as input parameters feature numerous issues in practice. They perform poorly out of sample due to estimation error, they experience extreme weights together with high sensitivity to change in input parameters. The heavy-tail characteristics of financial time series are in fact the cause for these erratic fluctuations of weights that consequently create substantial transaction costs. In robustifying the weights we present a toolbox for stabilizing costs and weights for global minimum Markowitz portfolios. Utilizing a projected gradient descent (PGD) technique, we avoid the estimation and inversion of the covariance operator as a whole and concentrate on robust estimation of the gradient descent increment. Using modern tools of robust statistics we construct a computationally efficient estimator with almost Gaussian properties based on median-of-means uniformly over weights. This robustified Markowitz approach is confirmed by empirical studies on equity markets. We demonstrate that robustified portfolios reach the lowest turnover compared to shrinkage-based and constrained portfolios while preserving or slightly improving out-of-sample performance.

translated by 谷歌翻译

Automatic Semantic Modeling for Structural Data Source with the Prior Knowledge from Knowledge Base

Jiakang Xu , Wolfgang Mayer , HongYu Zhang , Keqing He , Zaiwen Feng

分类：人工智能

2022-12-21

A critical step in sharing semantic content online is to map the structural data source to a public domain ontology. This problem is denoted as the Relational-To-Ontology Mapping Problem (Rel2Onto). A huge effort and expertise are required for manually modeling the semantics of data. Therefore, an automatic approach for learning the semantics of a data source is desirable. Most of the existing work studies the semantic annotation of source attributes. However, although critical, the research for automatically inferring the relationships between attributes is very limited. In this paper, we propose a novel method for semantically annotating structured data sources using machine learning, graph matching and modified frequent subgraph mining to amend the candidate model. In our work, Knowledge graph is used as prior knowledge. Our evaluation shows that our approach outperforms two state-of-the-art solutions in tricky cases where only a few semantic models are known.

translated by 谷歌翻译

Does It Affect You? Social and Learning Implications of Using Cognitive-Affective State Recognition for Proactive Human-Robot Tutoring

Matthias Kraus , Diana Betancourt , Wolfgang Minker

分类：机器人 | 自然语言处理

2022-12-20

Using robots in educational contexts has already shown to be beneficial for a student's learning and social behaviour. For levitating them to the next level of providing more effective and human-like tutoring, the ability to adapt to the user and to express proactivity is fundamental. By acting proactively, intelligent robotic tutors anticipate possible situations where problems for the student may arise and act in advance for preventing negative outcomes. Still, the decisions of when and how to behave proactively are open questions. Therefore, this paper deals with the investigation of how the student's cognitive-affective states can be used by a robotic tutor for triggering proactive tutoring dialogue. In doing so, it is aimed to improve the learning experience. For this reason, a concept learning task scenario was observed where a robotic assistant proactively helped when negative user states were detected. In a learning task, the user's states of frustration and confusion were deemed to have negative effects on the outcome of the task and were used to trigger proactive behaviour. In an empirical user study with 40 undergraduate and doctoral students, we studied whether the initiation of proactive behaviour after the detection of signs of confusion and frustration improves the student's concentration and trust in the agent. Additionally, we investigated which level of proactive dialogue is useful for promoting the student's concentration and trust. The results show that high proactive behaviour harms trust, especially when triggered during negative cognitive-affective states but contributes to keeping the student focused on the task when triggered in these states. Based on our study results, we further discuss future steps for improving the proactive assistance of robotic tutoring systems.

translated by 谷歌翻译

Mu$^{2}$SLAM: Multitask, Multilingual Speech and Language Models

Yong Cheng , Yu Zhang , Melvin Johnson , Wolfgang Macherey , Ankur Bapna

分类：自然语言处理

2022-12-19

We present Mu$^{2}$SLAM, a multilingual sequence-to-sequence model pre-trained jointly on unlabeled speech, unlabeled text and supervised data spanning Automatic Speech Recognition (ASR), Automatic Speech Translation (AST) and Machine Translation (MT), in over 100 languages. By leveraging a quantized representation of speech as a target, Mu$^{2}$SLAM trains the speech-text models with a sequence-to-sequence masked denoising objective similar to T5 on the decoder and a masked language modeling (MLM) objective on the encoder, for both unlabeled speech and text, while utilizing the supervised tasks to improve cross-lingual and cross-modal representation alignment within the model. On CoVoST AST, Mu$^{2}$SLAM establishes a new state-of-the-art for models trained on public datasets, improving on xx-en translation over the previous best by 1.9 BLEU points and on en-xx translation by 1.1 BLEU points. On Voxpopuli ASR, our model matches the performance of an mSLAM model fine-tuned with an RNN-T decoder, despite using a relatively weaker sequence-to-sequence architecture. On text understanding tasks, our model improves by more than 6\% over mSLAM on XNLI, getting closer to the performance of mT5 models of comparable capacity on XNLI and TydiQA, paving the way towards a single model for all speech and text understanding tasks.

translated by 谷歌翻译

Interpolation with the polynomial kernels

Giacomo Elefante , Wolfgang Erb , Francesco Marchetti , Emma Perracchione , Davide Poggiali , Gabriele Santin

分类：机器学习

2022-12-15

The polynomial kernels are widely used in machine learning and they are one of the default choices to develop kernel-based classification and regression models. However, they are rarely used and considered in numerical analysis due to their lack of strict positive definiteness. In particular they do not enjoy the usual property of unisolvency for arbitrary point sets, which is one of the key properties used to build kernel-based interpolation methods. This paper is devoted to establish some initial results for the study of these kernels, and their related interpolation algorithms, in the context of approximation theory. We will first prove necessary and sufficient conditions on point sets which guarantee the existence and uniqueness of an interpolant. We will then study the Reproducing Kernel Hilbert Spaces (or native spaces) of these kernels and their norms, and provide inclusion relations between spaces corresponding to different kernel parameters. With these spaces at hand, it will be further possible to derive generic error estimates which apply to sufficiently smooth functions, thus escaping the native space. Finally, we will show how to employ an efficient stable algorithm to these kernels to obtain accurate interpolants, and we will test them in some numerical experiment. After this analysis several computational and theoretical aspects remain open, and we will outline possible further research directions in a concluding section. This work builds some bridges between kernel and polynomial interpolation, two topics to which the authors, to different extents, have been introduced under the supervision or through the work of Stefano De Marchi. For this reason, they wish to dedicate this work to him in the occasion of his 60th birthday.

translated by 谷歌翻译

AutoPV: Automated photovoltaic forecasts with limited information using an ensemble of pre-trained models

Stefan Meisenbacher , Benedikt Heidrich , Tim Martin , Ralf Mikut , Veit Hagenmeyer

分类：机器学习

2022-12-13

Accurate PhotoVoltaic (PV) power generation forecasting is vital for the efficient operation of Smart Grids. The automated design of such accurate forecasting models for individual PV plants includes two challenges: First, information about the PV mounting configuration (i.e. inclination and azimuth angles) is often missing. Second, for new PV plants, the amount of historical data available to train a forecasting model is limited (cold-start problem). We address these two challenges by proposing a new method for day-ahead PV power generation forecasts called AutoPV. AutoPV is a weighted ensemble of forecasting models that represent different PV mounting configurations. This representation is achieved by pre-training each forecasting model on a separate PV plant and by scaling the model's output with the peak power rating of the corresponding PV plant. To tackle the cold-start problem, we initially weight each forecasting model in the ensemble equally. To tackle the problem of missing information about the PV mounting configuration, we use new data that become available during operation to adapt the ensemble weights to minimize the forecasting error. AutoPV is advantageous as the unknown PV mounting configuration is implicitly reflected in the ensemble weights, and only the PV plant's peak power rating is required to re-scale the ensemble's output. AutoPV also allows to represent PV plants with panels distributed on different roofs with varying alignments, as these mounting configurations can be reflected proportionally in the weighting. Additionally, the required computing memory is decoupled when scaling AutoPV to hundreds of PV plants, which is beneficial in Smart Grids with limited computing capabilities. For a real-world data set with 11 PV plants, the accuracy of AutoPV is comparable to a model trained on two years of data and outperforms an incrementally trained model.

translated by 谷歌翻译

Implementing Neural Network-Based Equalizers in a Coherent Optical Transmission System Using Field-Programmable Gate Arrays

Pedro J. Freire , Sasipim Srivallapanondh , Michael Anderson , Bernhard Spinnler , Thomas Bex , Tobias A. Eriksson , Antonio Napoli , Wolfgang Schairer , Nelson Costa , Michaela Blott

分类：机器学习

2022-12-09

In this work, we demonstrate the offline FPGA realization of both recurrent and feedforward neural network (NN)-based equalizers for nonlinearity compensation in coherent optical transmission systems. First, we present a realization pipeline showing the conversion of the models from Python libraries to the FPGA chip synthesis and implementation. Then, we review the main alternatives for the hardware implementation of nonlinear activation functions. The main results are divided into three parts: a performance comparison, an analysis of how activation functions are implemented, and a report on the complexity of the hardware. The performance in Q-factor is presented for the cases of bidirectional long-short-term memory coupled with convolutional NN (biLSTM + CNN) equalizer, CNN equalizer, and standard 1-StpS digital back-propagation (DBP) for the simulation and experiment propagation of a single channel dual-polarization (SC-DP) 16QAM at 34 GBd along 17x70km of LEAF. The biLSTM+CNN equalizer provides a similar result to DBP and a 1.7 dB Q-factor gain compared with the chromatic dispersion compensation baseline in the experimental dataset. After that, we assess the Q-factor and the impact of hardware utilization when approximating the activation functions of NN using Taylor series, piecewise linear, and look-up table (LUT) approximations. We also show how to mitigate the approximation errors with extra training and provide some insights into possible gradient problems in the LUT approximation. Finally, to evaluate the complexity of hardware implementation to achieve 400G throughput, fixed-point NN-based equalizers with approximated activation functions are developed and implemented in an FPGA.

translated by 谷歌翻译